Search Results

Documents authored by Zhang, Hao



Zhang, Hao

Document
Estimating Hourly Population Distribution Patterns at High Spatiotemporal Resolution in Urban Areas Using Geo-Tagged Tweets and Dasymetric Mapping

Authors: Jaehee Park, Hao Zhang, Su Yeon Han, Atsushi Nara, and Ming-Hsiang Tsou

Published in: LIPIcs, Volume 177, 11th International Conference on Geographic Information Science (GIScience 2021) - Part I (2020)


Abstract
This paper introduces a spatiotemporal analysis framework for estimating hourly changing population distribution patterns in urban areas using geo-tagged tweets (the messages containing users’ geospatial locations), land use data, and dasymetric maps. We collected geo-tagged social media (tweets) within the County of San Diego during one year (2015) by using Twitter’s Streaming Application Programming Interfaces (APIs). A semi-manual Twitter content verification procedure for data cleaning was applied first to separate tweets created by humans from non-human users (bots). The next step was to calculate the number of unique Twitter users every hour within census blocks. The final step was to estimate the actual population by transforming the numbers of unique Twitter users in each census block into estimated population densities with spatial and temporal factors using dasymetric maps. The temporal factor was estimated based on hourly changes of Twitter messages within San Diego County, CA. The spatial factor was estimated by using the dasymetric method with land use maps and 2010 census data. Comparing to census data, our methods can provide better estimated population in airports, shopping malls, sports stadiums, zoo and parks, and business areas during the day time.

Cite as

Jaehee Park, Hao Zhang, Su Yeon Han, Atsushi Nara, and Ming-Hsiang Tsou. Estimating Hourly Population Distribution Patterns at High Spatiotemporal Resolution in Urban Areas Using Geo-Tagged Tweets and Dasymetric Mapping. In 11th International Conference on Geographic Information Science (GIScience 2021) - Part I. Leibniz International Proceedings in Informatics (LIPIcs), Volume 177, pp. 10:1-10:16, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2020)


Copy BibTex To Clipboard

@InProceedings{park_et_al:LIPIcs.GIScience.2021.I.10,
  author =	{Park, Jaehee and Zhang, Hao and Han, Su Yeon and Nara, Atsushi and Tsou, Ming-Hsiang},
  title =	{{Estimating Hourly Population Distribution Patterns at High Spatiotemporal Resolution in Urban Areas Using Geo-Tagged Tweets and Dasymetric Mapping}},
  booktitle =	{11th International Conference on Geographic Information Science (GIScience 2021) - Part I},
  pages =	{10:1--10:16},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-166-5},
  ISSN =	{1868-8969},
  year =	{2020},
  volume =	{177},
  editor =	{Janowicz, Krzysztof and Verstegen, Judith A.},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.GIScience.2021.I.10},
  URN =		{urn:nbn:de:0030-drops-130456},
  doi =		{10.4230/LIPIcs.GIScience.2021.I.10},
  annote =	{Keywords: Population Estimation, Twitter, Social Media, Dasymetric Map, Spatiotemporal}
}

Zhang, Chihao

Document
Track A: Algorithms, Complexity and Games
A Perfect Sampler for Hypergraph Independent Sets

Authors: Guoliang Qiu, Yanheng Wang, and Chihao Zhang

Published in: LIPIcs, Volume 229, 49th International Colloquium on Automata, Languages, and Programming (ICALP 2022)


Abstract
The problem of uniformly sampling hypergraph independent sets is revisited. We design an efficient perfect sampler for the problem under a similar condition of the asymmetric Lovász local lemma. When specialized to d-regular k-uniform hypergraphs on n vertices, our sampler terminates in expected O(n log n) time provided d ≤ c⋅ 2^{k/2} where c > 0 is a constant, matching the rapid mixing condition for Glauber dynamics in Hermon, Sly and Zhang [Hermon et al., 2019]. The analysis of our algorithm is simple and clean.

Cite as

Guoliang Qiu, Yanheng Wang, and Chihao Zhang. A Perfect Sampler for Hypergraph Independent Sets. In 49th International Colloquium on Automata, Languages, and Programming (ICALP 2022). Leibniz International Proceedings in Informatics (LIPIcs), Volume 229, pp. 103:1-103:16, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2022)


Copy BibTex To Clipboard

@InProceedings{qiu_et_al:LIPIcs.ICALP.2022.103,
  author =	{Qiu, Guoliang and Wang, Yanheng and Zhang, Chihao},
  title =	{{A Perfect Sampler for Hypergraph Independent Sets}},
  booktitle =	{49th International Colloquium on Automata, Languages, and Programming (ICALP 2022)},
  pages =	{103:1--103:16},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-235-8},
  ISSN =	{1868-8969},
  year =	{2022},
  volume =	{229},
  editor =	{Boja\'{n}czyk, Miko{\l}aj and Merelli, Emanuela and Woodruff, David P.},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.ICALP.2022.103},
  URN =		{urn:nbn:de:0030-drops-164442},
  doi =		{10.4230/LIPIcs.ICALP.2022.103},
  annote =	{Keywords: Coupling from the past, Markov chains, Hypergraph independent sets}
}
Document
Sampling in Potts Model on Sparse Random Graphs

Authors: Yitong Yin and Chihao Zhang

Published in: LIPIcs, Volume 60, Approximation, Randomization, and Combinatorial Optimization. Algorithms and Techniques (APPROX/RANDOM 2016)


Abstract
We study the problem of sampling almost uniform proper q-colorings in sparse Erdos-Renyi random graphs G(n,d/n), a research initiated by Dyer, Flaxman, Frieze and Vigoda [Dyer et al., RANDOM STRUCT ALGOR, 2006]. We obtain a fully polynomial time almost uniform sampler (FPAUS) for the problem provided q>3d+4, improving the current best bound q>5.5d [Efthymiou, SODA, 2014]. Our sampling algorithm works for more generalized models and broader family of sparse graphs. It is an efficient sampler (in the same sense of FPAUS) for anti-ferromagnetic Potts model with activity 0<=b<1 on G(n,d/n) provided q>3(1-b)d+4. We further identify a family of sparse graphs to which all these results can be extended. This family of graphs is characterized by the notion of contraction function, which is a new measure of the average degree in graphs.

Cite as

Yitong Yin and Chihao Zhang. Sampling in Potts Model on Sparse Random Graphs. In Approximation, Randomization, and Combinatorial Optimization. Algorithms and Techniques (APPROX/RANDOM 2016). Leibniz International Proceedings in Informatics (LIPIcs), Volume 60, pp. 47:1-47:22, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2016)


Copy BibTex To Clipboard

@InProceedings{yin_et_al:LIPIcs.APPROX-RANDOM.2016.47,
  author =	{Yin, Yitong and Zhang, Chihao},
  title =	{{Sampling in Potts Model on Sparse Random Graphs}},
  booktitle =	{Approximation, Randomization, and Combinatorial Optimization. Algorithms and Techniques (APPROX/RANDOM 2016)},
  pages =	{47:1--47:22},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-018-7},
  ISSN =	{1868-8969},
  year =	{2016},
  volume =	{60},
  editor =	{Jansen, Klaus and Mathieu, Claire and Rolim, Jos\'{e} D. P. and Umans, Chris},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.APPROX-RANDOM.2016.47},
  URN =		{urn:nbn:de:0030-drops-66706},
  doi =		{10.4230/LIPIcs.APPROX-RANDOM.2016.47},
  annote =	{Keywords: Potts model, Sampling, Random Graph, Approximation Algorithm}
}
Document
FPTAS for Hardcore and Ising Models on Hypergraphs

Authors: Pinyan Lu, Kuan Yang, and Chihao Zhang

Published in: LIPIcs, Volume 47, 33rd Symposium on Theoretical Aspects of Computer Science (STACS 2016)


Abstract
Hardcore and Ising models are two most important families of two state spin systems in statistic physics. Partition function of spin systems is the center concept in statistic physics which connects microscopic particles and their interactions with their macroscopic and statistical properties of materials such as energy, entropy, ferromagnetism, etc. If each local interaction of the system involves only two particles, the system can be described by a graph. In this case, fully polynomial-time approximation scheme (FPTAS) for computing the partition function of both hardcore and anti-ferromagnetic Ising model was designed up to the uniqueness condition of the system. These result are the best possible since approximately computing the partition function beyond this threshold is NP-hard. In this paper, we generalize these results to general physics systems, where each local interaction may involves multiple particles. Such systems are described by hypergraphs. For hardcore model, we also provide FPTAS up to the uniqueness condition, and for anti-ferromagnetic Ising model, we obtain FPTAS under a slightly stronger condition.

Cite as

Pinyan Lu, Kuan Yang, and Chihao Zhang. FPTAS for Hardcore and Ising Models on Hypergraphs. In 33rd Symposium on Theoretical Aspects of Computer Science (STACS 2016). Leibniz International Proceedings in Informatics (LIPIcs), Volume 47, pp. 51:1-51:14, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2016)


Copy BibTex To Clipboard

@InProceedings{lu_et_al:LIPIcs.STACS.2016.51,
  author =	{Lu, Pinyan and Yang, Kuan and Zhang, Chihao},
  title =	{{FPTAS for Hardcore and Ising Models on Hypergraphs}},
  booktitle =	{33rd Symposium on Theoretical Aspects of Computer Science (STACS 2016)},
  pages =	{51:1--51:14},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-001-9},
  ISSN =	{1868-8969},
  year =	{2016},
  volume =	{47},
  editor =	{Ollinger, Nicolas and Vollmer, Heribert},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.STACS.2016.51},
  URN =		{urn:nbn:de:0030-drops-57526},
  doi =		{10.4230/LIPIcs.STACS.2016.51},
  annote =	{Keywords: hard-core model, ising model, hypergraph, spatial mixing, correlation decay}
}
Document
The Complexity of Ferromagnetic Two-spin Systems with External Fields

Authors: Jingcheng Liu, Pinyan Lu, and Chihao Zhang

Published in: LIPIcs, Volume 28, Approximation, Randomization, and Combinatorial Optimization. Algorithms and Techniques (APPROX/RANDOM 2014)


Abstract
We study the approximability of computing the partition function for ferromagnetic two-state spin systems. The remarkable algorithm by Jerrum and Sinclair showed that there is a fully polynomial-time randomized approximation scheme (FPRAS) for the special ferromagnetic Ising model with any given uniform external field. Later, Goldberg and Jerrum proved that it is #BIS-hard for Ising model if we allow inconsistent external fields on different nodes. In contrast to these two results, we prove that for any ferromagnetic two-state spin systems except the Ising model, there exists a threshold for external fields beyond which the problem is #BIS-hard, even if the external field is uniform.

Cite as

Jingcheng Liu, Pinyan Lu, and Chihao Zhang. The Complexity of Ferromagnetic Two-spin Systems with External Fields. In Approximation, Randomization, and Combinatorial Optimization. Algorithms and Techniques (APPROX/RANDOM 2014). Leibniz International Proceedings in Informatics (LIPIcs), Volume 28, pp. 843-856, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2014)


Copy BibTex To Clipboard

@InProceedings{liu_et_al:LIPIcs.APPROX-RANDOM.2014.843,
  author =	{Liu, Jingcheng and Lu, Pinyan and Zhang, Chihao},
  title =	{{The Complexity of Ferromagnetic Two-spin Systems with External Fields}},
  booktitle =	{Approximation, Randomization, and Combinatorial Optimization. Algorithms and Techniques (APPROX/RANDOM 2014)},
  pages =	{843--856},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-939897-74-3},
  ISSN =	{1868-8969},
  year =	{2014},
  volume =	{28},
  editor =	{Jansen, Klaus and Rolim, Jos\'{e} and Devanur, Nikhil R. and Moore, Cristopher},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.APPROX-RANDOM.2014.843},
  URN =		{urn:nbn:de:0030-drops-47428},
  doi =		{10.4230/LIPIcs.APPROX-RANDOM.2014.843},
  annote =	{Keywords: Spin System, #BIS-hard, FPRAS}
}

Zhang, Yuhao

Document
On the Perturbation Function of Ranking and Balance for Weighted Online Bipartite Matching

Authors: Jingxun Liang, Zhihao Gavin Tang, Yixuan Even Xu, Yuhao Zhang, and Renfei Zhou

Published in: LIPIcs, Volume 274, 31st Annual European Symposium on Algorithms (ESA 2023)


Abstract
Ranking and Balance are arguably the two most important algorithms in the online matching literature. They achieve the same optimal competitive ratio of 1-1/e for the integral version and fractional version of online bipartite matching by Karp, Vazirani, and Vazirani (STOC 1990) respectively. The two algorithms have been generalized to weighted online bipartite matching problems, including vertex-weighted online bipartite matching and AdWords, by utilizing a perturbation function. The canonical choice of the perturbation function is f(x) = 1-e^{x-1} as it leads to the optimal competitive ratio of 1-1/e in both settings. We advance the understanding of the weighted generalizations of Ranking and Balance in this paper, with a focus on studying the effect of different perturbation functions. First, we prove that the canonical perturbation function is the unique optimal perturbation function for vertex-weighted online bipartite matching. In stark contrast, all perturbation functions achieve the optimal competitive ratio of 1-1/e in the unweighted setting. Second, we prove that the generalization of Ranking to AdWords with unknown budgets using the canonical perturbation function is at most 0.624 competitive, refuting a conjecture of Vazirani (2021). More generally, as an application of the first result, we prove that no perturbation function leads to the prominent competitive ratio of 1-1/e by establishing an upper bound of 1-1/e-0.0003. Finally, we propose the online budget-additive welfare maximization problem that is intermediate between AdWords and AdWords with unknown budgets, and we design an optimal 1-1/e competitive algorithm by generalizing Balance.

Cite as

Jingxun Liang, Zhihao Gavin Tang, Yixuan Even Xu, Yuhao Zhang, and Renfei Zhou. On the Perturbation Function of Ranking and Balance for Weighted Online Bipartite Matching. In 31st Annual European Symposium on Algorithms (ESA 2023). Leibniz International Proceedings in Informatics (LIPIcs), Volume 274, pp. 80:1-80:15, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2023)


Copy BibTex To Clipboard

@InProceedings{liang_et_al:LIPIcs.ESA.2023.80,
  author =	{Liang, Jingxun and Tang, Zhihao Gavin and Xu, Yixuan Even and Zhang, Yuhao and Zhou, Renfei},
  title =	{{On the Perturbation Function of Ranking and Balance for Weighted Online Bipartite Matching}},
  booktitle =	{31st Annual European Symposium on Algorithms (ESA 2023)},
  pages =	{80:1--80:15},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-295-2},
  ISSN =	{1868-8969},
  year =	{2023},
  volume =	{274},
  editor =	{G{\o}rtz, Inge Li and Farach-Colton, Martin and Puglisi, Simon J. and Herman, Grzegorz},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.ESA.2023.80},
  URN =		{urn:nbn:de:0030-drops-187334},
  doi =		{10.4230/LIPIcs.ESA.2023.80},
  annote =	{Keywords: Online Matching, AdWords, Ranking, Water-Filling}
}
Document
Improved Algorithms for Online Rent Minimization Problem Under Unit-Size Jobs

Authors: Enze Sun, Zonghan Yang, and Yuhao Zhang

Published in: LIPIcs, Volume 274, 31st Annual European Symposium on Algorithms (ESA 2023)


Abstract
We consider the Online Rent Minimization problem, where online jobs with release times, deadlines, and processing times must be scheduled on machines that can be rented for a fixed length period of T. The objective is to minimize the number of machine rents. This problem generalizes the Online Machine Minimization problem where machines can be rented for an infinite period, and both problems have an asymptotically optimal competitive ratio of O(log(p_max/p_min)) for general processing times, where p_max and p_min are the maximum and minimum processing times respectively. However, for small values of p_max/p_min, a better competitive ratio can be achieved by assuming unit-size jobs. Under this assumption, Devanur et al. (2014) gave an optimal e-competitive algorithm for Online Machine Minimization, and Chen and Zhang (2022) gave a (3e+7) ≈ 15.16-competitive algorithm for Online Rent Minimization. In this paper, we significantly improve the competitive ratio of the Online Rent Minimization problem under unit size to 6, by using a clean oracle-based online algorithm framework.

Cite as

Enze Sun, Zonghan Yang, and Yuhao Zhang. Improved Algorithms for Online Rent Minimization Problem Under Unit-Size Jobs. In 31st Annual European Symposium on Algorithms (ESA 2023). Leibniz International Proceedings in Informatics (LIPIcs), Volume 274, pp. 97:1-97:14, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2023)


Copy BibTex To Clipboard

@InProceedings{sun_et_al:LIPIcs.ESA.2023.97,
  author =	{Sun, Enze and Yang, Zonghan and Zhang, Yuhao},
  title =	{{Improved Algorithms for Online Rent Minimization Problem Under Unit-Size Jobs}},
  booktitle =	{31st Annual European Symposium on Algorithms (ESA 2023)},
  pages =	{97:1--97:14},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-295-2},
  ISSN =	{1868-8969},
  year =	{2023},
  volume =	{274},
  editor =	{G{\o}rtz, Inge Li and Farach-Colton, Martin and Puglisi, Simon J. and Herman, Grzegorz},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.ESA.2023.97},
  URN =		{urn:nbn:de:0030-drops-187500},
  doi =		{10.4230/LIPIcs.ESA.2023.97},
  annote =	{Keywords: Online Algorithm, Scheduling, Machine Minimization, Rent Minimization}
}
Document
Minimizing the Maximum Flow Time in the Online Food Delivery Problem

Authors: Xiangyu Guo, Kelin Luo, Shi Li, and Yuhao Zhang

Published in: LIPIcs, Volume 248, 33rd International Symposium on Algorithms and Computation (ISAAC 2022)


Abstract
We study a common delivery problem encountered in nowadays online food-ordering platforms: Customers order dishes online, and the restaurant delivers the food after receiving the order. Specifically, we study a problem where k vehicles of capacity c are serving a set of requests ordering food from one restaurant. After a request arrives, it can be served by a vehicle moving from the restaurant to its delivery location. We are interested in serving all requests while minimizing the maximum flow-time, i.e., the maximum time length a customer waits to receive his/her food after submitting the order. We show that the problem is hard in both offline and online settings even when k = 1 and c = ∞: There is a hardness of approximation of Ω(n) for the offline problem, and a lower bound of Ω(n) on the competitive ratio of any online algorithm, where n is number of points in the metric. We circumvent the strong negative results in two directions. Our main result is an O(1)-competitive online algorithm for the uncapacitated (i.e, c = ∞) food delivery problem on tree metrics; we also have negative result showing that the condition c = ∞ is needed. Then we explore the speed-augmentation model where our online algorithm is allowed to use vehicles with faster speed. We show that a moderate speeding factor leads to a constant competitive ratio, and we prove a tight trade-off between the speeding factor and the competitive ratio.

Cite as

Xiangyu Guo, Kelin Luo, Shi Li, and Yuhao Zhang. Minimizing the Maximum Flow Time in the Online Food Delivery Problem. In 33rd International Symposium on Algorithms and Computation (ISAAC 2022). Leibniz International Proceedings in Informatics (LIPIcs), Volume 248, pp. 33:1-33:18, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2022)


Copy BibTex To Clipboard

@InProceedings{guo_et_al:LIPIcs.ISAAC.2022.33,
  author =	{Guo, Xiangyu and Luo, Kelin and Li, Shi and Zhang, Yuhao},
  title =	{{Minimizing the Maximum Flow Time in the Online Food Delivery Problem}},
  booktitle =	{33rd International Symposium on Algorithms and Computation (ISAAC 2022)},
  pages =	{33:1--33:18},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-258-7},
  ISSN =	{1868-8969},
  year =	{2022},
  volume =	{248},
  editor =	{Bae, Sang Won and Park, Heejin},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.ISAAC.2022.33},
  URN =		{urn:nbn:de:0030-drops-173181},
  doi =		{10.4230/LIPIcs.ISAAC.2022.33},
  annote =	{Keywords: Online algorithm, Capacitated Vehicle Routing, Flow Time Optimization}
}
Document
Track A: Algorithms, Complexity and Games
Almost Tight Approximation Hardness for Single-Source Directed k-Edge-Connectivity

Authors: Chao Liao, Qingyun Chen, Bundit Laekhanukit, and Yuhao Zhang

Published in: LIPIcs, Volume 229, 49th International Colloquium on Automata, Languages, and Programming (ICALP 2022)


Abstract
In the k-outconnected directed Steiner tree problem (k-DST), we are given an n-vertex directed graph G = (V,E) with edge costs, a connectivity requirement k, a root r ∈ V and a set of terminals T ⊆ V. The goal is to find a minimum-cost subgraph H ⊆ G that has k edge-disjoint paths from the root vertex r to every terminal t ∈ T. The problem is NP-hard, and inapproximability results are known in several parameters, e.g., hardness in terms of n: log^{2-ε}n-hardness for k = 1 [Halperin and Krauthgamer, STOC'03], 2^{log^{1-ε}n}-hardness for general case [Cheriyan, Laekhanukit, Naves and Vetta, SODA'12], hardness in terms of k [Cheriyan et al., SODA'12; Laekhanukit, SODA'14; Manurangsi, IPL'19] and hardness in terms of |T| [Laekhanukit, SODA'14]. In this paper, we show the approximation hardness of k-DST for various parameters. - Ω(|T|/log |T|)-approximation hardness, which holds under the standard complexity assumption NP≠ ZPP. The inapproximability ratio is tightened to Ω(|T|) under the Strongish Planted Clique Hypothesis [Manurangsi, Rubinstein and Schramm, ITCS 2021]. The latter hardness result matches the approximation ratio of |T| obtained by a trivial approximation algorithm, thus closing the long-standing open problem. - Ω(2^{k/2} / k)-approximation hardness for the general case of k-DST under the assumption NP≠ZPP. This is the first hardness result known for survivable network design problems with an inapproximability ratio exponential in k. - Ω((k/L)^{L/4})-approximation hardness for k-DST on L-layered graphs for L ≤ O(log n). This almost matches the approximation ratio of O(k^{L-1}⋅ L ⋅ log |T|) achieved in O(n^L)-time due to Laekhanukit [ICALP'16]. We further extend our hardness results in terms of |T| to the undirected cases of k-DST, namely the single-source k-vertex-connected Steiner tree and the k-edge-connected group Steiner tree problems. Thus, we obtain Ω(|T|/log |T|) and Ω(|T|) approximation hardness for both problems under the assumption NP≠ ZPP and the Strongish Planted Clique Hypothesis, respectively. This again matches the upper bound obtained by trivial algorithms.

Cite as

Chao Liao, Qingyun Chen, Bundit Laekhanukit, and Yuhao Zhang. Almost Tight Approximation Hardness for Single-Source Directed k-Edge-Connectivity. In 49th International Colloquium on Automata, Languages, and Programming (ICALP 2022). Leibniz International Proceedings in Informatics (LIPIcs), Volume 229, pp. 89:1-89:17, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2022)


Copy BibTex To Clipboard

@InProceedings{liao_et_al:LIPIcs.ICALP.2022.89,
  author =	{Liao, Chao and Chen, Qingyun and Laekhanukit, Bundit and Zhang, Yuhao},
  title =	{{Almost Tight Approximation Hardness for Single-Source Directed k-Edge-Connectivity}},
  booktitle =	{49th International Colloquium on Automata, Languages, and Programming (ICALP 2022)},
  pages =	{89:1--89:17},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-235-8},
  ISSN =	{1868-8969},
  year =	{2022},
  volume =	{229},
  editor =	{Boja\'{n}czyk, Miko{\l}aj and Merelli, Emanuela and Woodruff, David P.},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.ICALP.2022.89},
  URN =		{urn:nbn:de:0030-drops-164309},
  doi =		{10.4230/LIPIcs.ICALP.2022.89},
  annote =	{Keywords: Directed Steiner Tree, Hardness of Approximation, Fault-Tolerant and Survivable Network Design}
}
Document
APPROX
Polylogarithmic Approximation Algorithm for k-Connected Directed Steiner Tree on Quasi-Bipartite Graphs

Authors: Chun-Hsiang Chan, Bundit Laekhanukit, Hao-Ting Wei, and Yuhao Zhang

Published in: LIPIcs, Volume 176, Approximation, Randomization, and Combinatorial Optimization. Algorithms and Techniques (APPROX/RANDOM 2020)


Abstract
In the k-Connected Directed Steiner Tree problem (k-DST), we are given a directed graph G = (V,E) with edge (or vertex) costs, a root vertex r, a set of q terminals T, and a connectivity requirement k > 0; the goal is to find a minimum-cost subgraph H of G such that H has k edge-disjoint paths from the root r to each terminal in T. The k-DST problem is a natural generalization of the classical Directed Steiner Tree problem (DST) in the fault-tolerant setting in which the solution subgraph is required to have an r,t-path, for every terminal t, even after removing k-1 vertices or edges. Despite being a classical problem, there are not many positive results on the problem, especially for the case k ≥ 3. In this paper, we present an O(log k log q)-approximation algorithm for k-DST when an input graph is quasi-bipartite, i.e., when there is no edge joining two non-terminal vertices. To the best of our knowledge, our algorithm is the only known non-trivial approximation algorithm for k-DST, for k ≥ 3, that runs in polynomial-time Our algorithm is tight for every constant k, due to the hardness result inherited from the Set Cover problem.

Cite as

Chun-Hsiang Chan, Bundit Laekhanukit, Hao-Ting Wei, and Yuhao Zhang. Polylogarithmic Approximation Algorithm for k-Connected Directed Steiner Tree on Quasi-Bipartite Graphs. In Approximation, Randomization, and Combinatorial Optimization. Algorithms and Techniques (APPROX/RANDOM 2020). Leibniz International Proceedings in Informatics (LIPIcs), Volume 176, pp. 63:1-63:20, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2020)


Copy BibTex To Clipboard

@InProceedings{chan_et_al:LIPIcs.APPROX/RANDOM.2020.63,
  author =	{Chan, Chun-Hsiang and Laekhanukit, Bundit and Wei, Hao-Ting and Zhang, Yuhao},
  title =	{{Polylogarithmic Approximation Algorithm for k-Connected Directed Steiner Tree on Quasi-Bipartite Graphs}},
  booktitle =	{Approximation, Randomization, and Combinatorial Optimization. Algorithms and Techniques (APPROX/RANDOM 2020)},
  pages =	{63:1--63:20},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-164-1},
  ISSN =	{1868-8969},
  year =	{2020},
  volume =	{176},
  editor =	{Byrka, Jaros{\l}aw and Meka, Raghu},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.APPROX/RANDOM.2020.63},
  URN =		{urn:nbn:de:0030-drops-126667},
  doi =		{10.4230/LIPIcs.APPROX/RANDOM.2020.63},
  annote =	{Keywords: Approximation Algorithms, Network Design, Directed Graphs}
}
Document
Online Makespan Minimization: The Power of Restart

Authors: Zhiyi Huang, Ning Kang, Zhihao Gavin Tang, Xiaowei Wu, and Yuhao Zhang

Published in: LIPIcs, Volume 116, Approximation, Randomization, and Combinatorial Optimization. Algorithms and Techniques (APPROX/RANDOM 2018)


Abstract
We consider the online makespan minimization problem on identical machines. Chen and Vestjens (ORL 1997) show that the largest processing time first (LPT) algorithm is 1.5-competitive. For the special case of two machines, Noga and Seiden (TCS 2001) introduce the SLEEPY algorithm that achieves a competitive ratio of (5 - sqrt{5})/2 ~~ 1.382, matching the lower bound by Chen and Vestjens (ORL 1997). Furthermore, Noga and Seiden note that in many applications one can kill a job and restart it later, and they leave an open problem whether algorithms with restart can obtain better competitive ratios. We resolve this long-standing open problem on the positive end. Our algorithm has a natural rule for killing a processing job: a newly-arrived job replaces the smallest processing job if 1) the new job is larger than other pending jobs, 2) the new job is much larger than the processing one, and 3) the processed portion is small relative to the size of the new job. With appropriate choice of parameters, we show that our algorithm improves the 1.5 competitive ratio for the general case, and the 1.382 competitive ratio for the two-machine case.

Cite as

Zhiyi Huang, Ning Kang, Zhihao Gavin Tang, Xiaowei Wu, and Yuhao Zhang. Online Makespan Minimization: The Power of Restart. In Approximation, Randomization, and Combinatorial Optimization. Algorithms and Techniques (APPROX/RANDOM 2018). Leibniz International Proceedings in Informatics (LIPIcs), Volume 116, pp. 14:1-14:19, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2018)


Copy BibTex To Clipboard

@InProceedings{huang_et_al:LIPIcs.APPROX-RANDOM.2018.14,
  author =	{Huang, Zhiyi and Kang, Ning and Tang, Zhihao Gavin and Wu, Xiaowei and Zhang, Yuhao},
  title =	{{Online Makespan Minimization: The Power of Restart}},
  booktitle =	{Approximation, Randomization, and Combinatorial Optimization. Algorithms and Techniques (APPROX/RANDOM 2018)},
  pages =	{14:1--14:19},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-085-9},
  ISSN =	{1868-8969},
  year =	{2018},
  volume =	{116},
  editor =	{Blais, Eric and Jansen, Klaus and D. P. Rolim, Jos\'{e} and Steurer, David},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.APPROX-RANDOM.2018.14},
  URN =		{urn:nbn:de:0030-drops-94182},
  doi =		{10.4230/LIPIcs.APPROX-RANDOM.2018.14},
  annote =	{Keywords: Online Scheduling, Makespan Minimization, Identical Machines}
}
Document
Online Vertex-Weighted Bipartite Matching: Beating 1-1/e with Random Arrivals

Authors: Zhiyi Huang, Zhihao Gavin Tang, Xiaowei Wu, and Yuhao Zhang

Published in: LIPIcs, Volume 107, 45th International Colloquium on Automata, Languages, and Programming (ICALP 2018)


Abstract
We introduce a weighted version of the ranking algorithm by Karp et al. (STOC 1990), and prove a competitive ratio of 0.6534 for the vertex-weighted online bipartite matching problem when online vertices arrive in random order. Our result shows that random arrivals help beating the 1-1/e barrier even in the vertex-weighted case. We build on the randomized primal-dual framework by Devanur et al. (SODA 2013) and design a two dimensional gain sharing function, which depends not only on the rank of the offline vertex, but also on the arrival time of the online vertex. To our knowledge, this is the first competitive ratio strictly larger than 1-1/e for an online bipartite matching problem achieved under the randomized primal-dual framework. Our algorithm has a natural interpretation that offline vertices offer a larger portion of their weights to the online vertices as time goes by, and each online vertex matches the neighbor with the highest offer at its arrival.

Cite as

Zhiyi Huang, Zhihao Gavin Tang, Xiaowei Wu, and Yuhao Zhang. Online Vertex-Weighted Bipartite Matching: Beating 1-1/e with Random Arrivals. In 45th International Colloquium on Automata, Languages, and Programming (ICALP 2018). Leibniz International Proceedings in Informatics (LIPIcs), Volume 107, pp. 79:1-79:14, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2018)


Copy BibTex To Clipboard

@InProceedings{huang_et_al:LIPIcs.ICALP.2018.79,
  author =	{Huang, Zhiyi and Tang, Zhihao Gavin and Wu, Xiaowei and Zhang, Yuhao},
  title =	{{Online Vertex-Weighted Bipartite Matching: Beating 1-1/e with Random Arrivals}},
  booktitle =	{45th International Colloquium on Automata, Languages, and Programming (ICALP 2018)},
  pages =	{79:1--79:14},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-076-7},
  ISSN =	{1868-8969},
  year =	{2018},
  volume =	{107},
  editor =	{Chatzigiannakis, Ioannis and Kaklamanis, Christos and Marx, D\'{a}niel and Sannella, Donald},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.ICALP.2018.79},
  URN =		{urn:nbn:de:0030-drops-90830},
  doi =		{10.4230/LIPIcs.ICALP.2018.79},
  annote =	{Keywords: Vertex Weighted, Online Bipartite Matching, Randomized Primal-Dual}
}

Zhang, Haoyuan

Document
FHJ: A Formal Model for Hierarchical Dispatching and Overriding

Authors: Yanlin Wang, Haoyuan Zhang, Bruno C. d. S. Oliveira, and Marco Servetto

Published in: LIPIcs, Volume 109, 32nd European Conference on Object-Oriented Programming (ECOOP 2018)


Abstract
Multiple inheritance is a valuable feature for Object-Oriented Programming. However, it is also tricky to get right, as illustrated by the extensive literature on the topic. A key issue is the ambiguity arising from inheriting multiple parents, which can have conflicting methods. Numerous existing work provides solutions for conflicts which arise from diamond inheritance: i.e. conflicts that arise from implementations sharing a common ancestor. However, most mechanisms are inadequate to deal with unintentional method conflicts: conflicts which arise from two unrelated methods that happen to share the same name and signature. This paper presents a new model called Featherweight Hierarchical Java (FHJ) that deals with unintentional method conflicts. In our new model, which is partly inspired by C++, conflicting methods arising from unrelated methods can coexist in the same class, and hierarchical dispatching supports unambiguous lookups in the presence of such conflicting methods. To avoid ambiguity, hierarchical information is employed in method dispatching, which uses a combination of static and dynamic type information to choose the implementation of a method at run-time. Furthermore, unlike all existing inheritance models, our model supports hierarchical method overriding: that is, methods can be independently overridden along the multiple inheritance hierarchy. We give illustrative examples of our language and features and formalize FHJ as a minimal Featherweight-Java style calculus.

Cite as

Yanlin Wang, Haoyuan Zhang, Bruno C. d. S. Oliveira, and Marco Servetto. FHJ: A Formal Model for Hierarchical Dispatching and Overriding. In 32nd European Conference on Object-Oriented Programming (ECOOP 2018). Leibniz International Proceedings in Informatics (LIPIcs), Volume 109, pp. 20:1-20:30, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2018)


Copy BibTex To Clipboard

@InProceedings{wang_et_al:LIPIcs.ECOOP.2018.20,
  author =	{Wang, Yanlin and Zhang, Haoyuan and Oliveira, Bruno C. d. S. and Servetto, Marco},
  title =	{{FHJ: A Formal Model for Hierarchical Dispatching and Overriding}},
  booktitle =	{32nd European Conference on Object-Oriented Programming (ECOOP 2018)},
  pages =	{20:1--20:30},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-079-8},
  ISSN =	{1868-8969},
  year =	{2018},
  volume =	{109},
  editor =	{Millstein, Todd},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.ECOOP.2018.20},
  URN =		{urn:nbn:de:0030-drops-92259},
  doi =		{10.4230/LIPIcs.ECOOP.2018.20},
  annote =	{Keywords: multiple inheritance, hierarchical dispatching, OOP, language design}
}

Zhang, Haozhe

Document
Conjunctive Queries with Free Access Patterns Under Updates

Authors: Ahmet Kara, Milos Nikolic, Dan Olteanu, and Haozhe Zhang

Published in: LIPIcs, Volume 255, 26th International Conference on Database Theory (ICDT 2023)


Abstract
We study the problem of answering conjunctive queries with free access patterns under updates. A free access pattern is a partition of the free variables of the query into input and output. The query returns tuples over the output variables given a tuple of values over the input variables. We introduce a fully dynamic evaluation approach for such queries. We also give a syntactic characterisation of those queries that admit constant time per single-tuple update and whose output tuples can be enumerated with constant delay given an input tuple. Finally, we chart the complexity trade-off between the preprocessing time, update time and enumeration delay for such queries. For a class of queries, our approach achieves optimal, albeit non-constant, update time and delay. Their optimality is predicated on the Online Matrix-Vector Multiplication conjecture. Our results recover prior work on the dynamic evaluation of conjunctive queries without access patterns.

Cite as

Ahmet Kara, Milos Nikolic, Dan Olteanu, and Haozhe Zhang. Conjunctive Queries with Free Access Patterns Under Updates. In 26th International Conference on Database Theory (ICDT 2023). Leibniz International Proceedings in Informatics (LIPIcs), Volume 255, pp. 17:1-17:20, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2023)


Copy BibTex To Clipboard

@InProceedings{kara_et_al:LIPIcs.ICDT.2023.17,
  author =	{Kara, Ahmet and Nikolic, Milos and Olteanu, Dan and Zhang, Haozhe},
  title =	{{Conjunctive Queries with Free Access Patterns Under Updates}},
  booktitle =	{26th International Conference on Database Theory (ICDT 2023)},
  pages =	{17:1--17:20},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-270-9},
  ISSN =	{1868-8969},
  year =	{2023},
  volume =	{255},
  editor =	{Geerts, Floris and Vandevoort, Brecht},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.ICDT.2023.17},
  URN =		{urn:nbn:de:0030-drops-177599},
  doi =		{10.4230/LIPIcs.ICDT.2023.17},
  annote =	{Keywords: fully dynamic algorithm, enumeration delay, complexity trade-off, dichotomy}
}
Document
Evaluation Trade-Offs for Acyclic Conjunctive Queries

Authors: Ahmet Kara, Milos Nikolic, Dan Olteanu, and Haozhe Zhang

Published in: LIPIcs, Volume 252, 31st EACSL Annual Conference on Computer Science Logic (CSL 2023)


Abstract
We consider the evaluation of acyclic conjunctive queries, where the evaluation time is decomposed into preprocessing time and enumeration delay. In a seminal paper at CSL'07, Bagan, Durand, and Grandjean showed that acyclic queries can be evaluated with linear preprocessing time and linear enumeration delay. If the query is free-connex, the enumeration delay becomes constant. Further prior work showed that constant enumeration delay can be achieved for arbitrary acyclic conjunctive queries at the expense of a preprocessing time that is characterised by the fractional hypertree width. We introduce an approach that exposes a trade-off between preprocessing time and enumeration delay for acyclic conjunctive queries. The aforementioned prior works represent extremes in this trade-off space. Yet our approach also allows for the enumeration delay and the preprocessing time between these extremes, in particular the delay may lie between constant and linear time. Our approach decomposes the given query into subqueries and achieves for each subquery a trade-off that depends on a parameter controlling the times for preprocessing and enumeration. The complexity of the query is given by the Pareto optimal points of a bi-objective optimisation program whose inputs are possible query decompositions and parameter values.

Cite as

Ahmet Kara, Milos Nikolic, Dan Olteanu, and Haozhe Zhang. Evaluation Trade-Offs for Acyclic Conjunctive Queries. In 31st EACSL Annual Conference on Computer Science Logic (CSL 2023). Leibniz International Proceedings in Informatics (LIPIcs), Volume 252, pp. 29:1-29:20, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2023)


Copy BibTex To Clipboard

@InProceedings{kara_et_al:LIPIcs.CSL.2023.29,
  author =	{Kara, Ahmet and Nikolic, Milos and Olteanu, Dan and Zhang, Haozhe},
  title =	{{Evaluation Trade-Offs for Acyclic Conjunctive Queries}},
  booktitle =	{31st EACSL Annual Conference on Computer Science Logic (CSL 2023)},
  pages =	{29:1--29:20},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-264-8},
  ISSN =	{1868-8969},
  year =	{2023},
  volume =	{252},
  editor =	{Klin, Bartek and Pimentel, Elaine},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.CSL.2023.29},
  URN =		{urn:nbn:de:0030-drops-174907},
  doi =		{10.4230/LIPIcs.CSL.2023.29},
  annote =	{Keywords: acyclic queries, query evaluation, enumeration delay}
}
Document
Counting Triangles under Updates in Worst-Case Optimal Time

Authors: Ahmet Kara, Hung Q. Ngo, Milos Nikolic, Dan Olteanu, and Haozhe Zhang

Published in: LIPIcs, Volume 127, 22nd International Conference on Database Theory (ICDT 2019)


Abstract
We consider the problem of incrementally maintaining the triangle count query under single-tuple updates to the input relations. We introduce an approach that exhibits a space-time tradeoff such that the space-time product is quadratic in the size of the input database and the update time can be as low as the square root of this size. This lowest update time is worst-case optimal conditioned on the Online Matrix-Vector Multiplication conjecture. The classical and factorized incremental view maintenance approaches are recovered as special cases of our approach within the space-time tradeoff. In particular, they require linear-time maintenance under updates, which is suboptimal. Our approach can also count all triangles in a static database in the worst-case optimal time needed for enumerating them.

Cite as

Ahmet Kara, Hung Q. Ngo, Milos Nikolic, Dan Olteanu, and Haozhe Zhang. Counting Triangles under Updates in Worst-Case Optimal Time. In 22nd International Conference on Database Theory (ICDT 2019). Leibniz International Proceedings in Informatics (LIPIcs), Volume 127, pp. 4:1-4:18, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2019)


Copy BibTex To Clipboard

@InProceedings{kara_et_al:LIPIcs.ICDT.2019.4,
  author =	{Kara, Ahmet and Ngo, Hung Q. and Nikolic, Milos and Olteanu, Dan and Zhang, Haozhe},
  title =	{{Counting Triangles under Updates in Worst-Case Optimal Time}},
  booktitle =	{22nd International Conference on Database Theory (ICDT 2019)},
  pages =	{4:1--4:18},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-101-6},
  ISSN =	{1868-8969},
  year =	{2019},
  volume =	{127},
  editor =	{Barcelo, Pablo and Calautti, Marco},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.ICDT.2019.4},
  URN =		{urn:nbn:de:0030-drops-103068},
  doi =		{10.4230/LIPIcs.ICDT.2019.4},
  annote =	{Keywords: incremental view maintenance, amortized analysis, data skew}
}

Zhang, Haowen

Document
Validating Paired-End Read Alignments in Sequence Graphs

Authors: Chirag Jain, Haowen Zhang, Alexander Dilthey, and Srinivas Aluru

Published in: LIPIcs, Volume 143, 19th International Workshop on Algorithms in Bioinformatics (WABI 2019)


Abstract
Graph based non-linear reference structures such as variation graphs and colored de Bruijn graphs enable incorporation of full genomic diversity within a population. However, transitioning from a simple string-based reference to graphs requires addressing many computational challenges, one of which concerns accurately mapping sequencing read sets to graphs. Paired-end Illumina sequencing is a commonly used sequencing platform in genomics, where the paired-end distance constraints allow disambiguation of repeats. Many recent works have explored provably good index-based and alignment-based strategies for mapping individual reads to graphs. However, validating distance constraints efficiently over graphs is not trivial, and existing sequence to graph mappers rely on heuristics. We introduce a mathematical formulation of the problem, and provide a new algorithm to solve it exactly. We take advantage of the high sparsity of reference graphs, and use sparse matrix-matrix multiplications (SpGEMM) to build an index which can be queried efficiently by a mapping algorithm for validating the distance constraints. Effectiveness of the algorithm is demonstrated using real reference graphs, including a human MHC variation graph, and a pan-genome de-Bruijn graph built using genomes of 20 B. anthracis strains. While the one-time indexing time can vary from a few minutes to a few hours using our algorithm, answering a million distance queries takes less than a second.

Cite as

Chirag Jain, Haowen Zhang, Alexander Dilthey, and Srinivas Aluru. Validating Paired-End Read Alignments in Sequence Graphs. In 19th International Workshop on Algorithms in Bioinformatics (WABI 2019). Leibniz International Proceedings in Informatics (LIPIcs), Volume 143, pp. 17:1-17:13, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2019)


Copy BibTex To Clipboard

@InProceedings{jain_et_al:LIPIcs.WABI.2019.17,
  author =	{Jain, Chirag and Zhang, Haowen and Dilthey, Alexander and Aluru, Srinivas},
  title =	{{Validating Paired-End Read Alignments in Sequence Graphs}},
  booktitle =	{19th International Workshop on Algorithms in Bioinformatics (WABI 2019)},
  pages =	{17:1--17:13},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-123-8},
  ISSN =	{1868-8969},
  year =	{2019},
  volume =	{143},
  editor =	{Huber, Katharina T. and Gusfield, Dan},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.WABI.2019.17},
  URN =		{urn:nbn:de:0030-drops-110470},
  doi =		{10.4230/LIPIcs.WABI.2019.17},
  annote =	{Keywords: Sequence graphs, read mapping, index, sparse matrix-matrix multiplication}
}

Zhang, Shaojie

Document
Efficient Haplotype Block Matching in Bi-Directional PBWT

Authors: Ardalan Naseri, William Yue, Shaojie Zhang, and Degui Zhi

Published in: LIPIcs, Volume 201, 21st International Workshop on Algorithms in Bioinformatics (WABI 2021)


Abstract
Efficient haplotype matching search is of great interest when large genotyped cohorts are becoming available. Positional Burrows-Wheeler Transform (PBWT) enables efficient searching for blocks of haplotype matches. However, existing efficient PBWT algorithms sweep across the haplotype panel from left to right, capturing all exact matches. As a result, PBWT does not account for mismatches. It is also not easy to investigate the patterns of changes between the matching blocks. Here, we present an extension to PBWT, called bi-directional PBWT that allows the information about the blocks of matches to be present at both sides of each site. We also present a set of algorithms to efficiently merge the matching blocks or examine the patterns of changes on both sides of each site. The time complexity of the algorithms to find and merge matching blocks using bi-directional PBWT is linear to the input size. Using real data from the UK Biobank, we demonstrate the run time and memory efficiency of our algorithms. More importantly, our algorithms can identify more blocks by enabling tolerance of mismatches. Moreover, by using mutual information (MI) between the forward and the reverse PBWT matching block sets as a measure of haplotype consistency, we found the MI derived from European samples in the 1000 Genomes Project is highly correlated (Spearman correlation r=0.87) with the deCODE recombination map.

Cite as

Ardalan Naseri, William Yue, Shaojie Zhang, and Degui Zhi. Efficient Haplotype Block Matching in Bi-Directional PBWT. In 21st International Workshop on Algorithms in Bioinformatics (WABI 2021). Leibniz International Proceedings in Informatics (LIPIcs), Volume 201, pp. 19:1-19:13, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2021)


Copy BibTex To Clipboard

@InProceedings{naseri_et_al:LIPIcs.WABI.2021.19,
  author =	{Naseri, Ardalan and Yue, William and Zhang, Shaojie and Zhi, Degui},
  title =	{{Efficient Haplotype Block Matching in Bi-Directional PBWT}},
  booktitle =	{21st International Workshop on Algorithms in Bioinformatics (WABI 2021)},
  pages =	{19:1--19:13},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-200-6},
  ISSN =	{1868-8969},
  year =	{2021},
  volume =	{201},
  editor =	{Carbone, Alessandra and El-Kebir, Mohammed},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.WABI.2021.19},
  URN =		{urn:nbn:de:0030-drops-143729},
  doi =		{10.4230/LIPIcs.WABI.2021.19},
  annote =	{Keywords: PBWT, Bi-directional, Haplotype Matching}
}
Questions / Remarks / Feedback
X

Feedback for Dagstuhl Publishing


Thanks for your feedback!

Feedback submitted

Could not send message

Please try again later or send an E-mail